102 research outputs found

    Highly expressed proteins have an increased frequency of alanine in the second amino acid position

    BACKGROUND: Although the sequence requirements for translation initiation regions have been frequently analysed, usually the highly expressed genes are not treated as a separate dataset. RESULTS: To investigate this, we analysed the mRNA regions downstream of initiation codons in nine bacteria, three archaea and three unicellular eukaryotes, comparing the dataset of highly expressed genes to the dataset of all genes. In addition to the detailed analysis of the nucleotide and codon frequencies we compared the N-termini of highly expressed proteins to the N-termini of all proteins coded in the genome. CONCLUSION: The most conserved pattern was observed at the amino acid level: strong alanine over-representation was observed at the second amino acid position of highly expressed proteins. This pattern is well conserved in all three domains of life.
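
The comparison described in this abstract can be illustrated with a minimal sketch. It assumes protein sequences are plain strings that start with the initiator methionine; the function name and the toy sequences are hypothetical and are not taken from the paper.

```python
from collections import Counter


def second_position_frequencies(proteins):
    """Relative frequency of each residue at position 2 (right after the initiator Met)."""
    residues = [seq[1] for seq in proteins if len(seq) >= 2]
    total = len(residues)
    if total == 0:
        return {}
    return {aa: n / total for aa, n in Counter(residues).items()}


# Toy sequences invented for illustration; they are not real data.
all_proteins = ["MKTLL", "MAKQV", "MSERT", "MTQLA"]
highly_expressed = ["MAKQV", "MAREL", "MSKIV"]

freq_all = second_position_frequencies(all_proteins)
freq_high = second_position_frequencies(highly_expressed)

# Ratio above 1 means alanine is over-represented in the highly expressed set.
if freq_all.get("A"):
    alanine_ratio = freq_high.get("A", 0.0) / freq_all["A"]
    print(f"Alanine at position 2, high/all ratio: {alanine_ratio:.2f}")
```

A real analysis would also need a criterion for selecting the highly expressed set and a statistical test, neither of which is shown here.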

    SNPmasker: automatic masking of SNPs and repeats across eukaryotic genomes

    SNPmasker is a comprehensive web interface for masking large eukaryotic genomes. The program is designed to mask SNPs from a recent dbSNP database and to mask repeats with two alternative programs. In addition to SNP masking, we also offer population-specific substitution of SNP alleles in genomic sequence according to SNP frequencies in HapMap Phase II data. The input to SNPmasker can be defined in chromosomal coordinates or inserted as a sequence. The sequences masked by our web server are most useful as a preliminary step for different primer and probe design tasks. The service is available online and is free for all users.
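
As a rough illustration of what masking SNP positions in a sequence means, the sketch below replaces known variant positions with a mask character. The function name, the 0-based coordinate convention and the choice of "N" as mask character are assumptions made for this example; they do not describe how SNPmasker itself is implemented.

```python
def mask_snps(sequence, snp_positions, mask_char="N"):
    """Replace the bases at the given 0-based positions with mask_char."""
    chars = list(sequence)
    for pos in snp_positions:
        # Silently skip positions that fall outside the sequence.
        if 0 <= pos < len(chars):
            chars[pos] = mask_char
    return "".join(chars)


# Toy sequence and positions, invented for illustration.
print(mask_snps("ACGTACGT", [2, 5]))
```

Primer design tools can then be told to avoid the masked bases, so that no primer overlaps a known polymorphism.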

    Translation initiation region sequence preferences in Escherichia coli

    BACKGROUND: The mRNA translation initiation region (TIR) comprises the initiator codon, Shine-Dalgarno (SD) sequence and translational enhancers. Probably the most abundant class of enhancers contains A/U-rich sequences. We have tested the influence of SD sequence length and the presence of enhancers on the efficiency of translation initiation. RESULTS: We found that during bacterial growth at 37°C, a six-nucleotide SD (AGGAGG) is more efficient than shorter or longer sequences. The A/U-rich enhancer contributes strongly to the efficiency of initiation, having the greatest stimulatory effect in the exponential growth phase of the bacteria. The SD sequences and the A/U-rich enhancer stimulate translation co-operatively: strong SDs are stimulated by the enhancer much more than weak SDs. The bacterial growth rate does not have a major influence on the TIR selection pattern. On the other hand, temperature affects the TIR preference pattern: shorter SD sequences are preferred at lower growth temperatures. We also performed an in silico analysis of the TIRs in all E. coli mRNAs. The base pairing potential of the SD sequences does not correlate with the codon adaptation index, which is used as an estimate of gene expression level. CONCLUSION: In E. coli the SD selection preferences are influenced by the growth temperature and not influenced by the growth rate. The A/U-rich enhancers stimulate translation considerably by acting co-operatively with the SD sequences.

    GENOMEMASKER package for designing unique genomic PCR primers

    BACKGROUND: The design of oligonucleotides and PCR primers for studying large genomes is complicated by the redundancy of sequences. The eukaryotic genomes are particularly difficult to study due to abundant repeats. The speed of most existing primer evaluation programs is not sufficient for large-scale experiments. RESULTS: In order to improve the efficiency and success rate of automatic primer/oligo design, we created a novel method which allows rapid masking of repeats in large sequence files, for example in eukaryotic genomes. It also allows the detection of all alternative binding sites of PCR primers and the prediction of PCR products. The new method was implemented in a collection of efficient programs, the GENOMEMASKER package. The performance of the programs was compared to other similar programs. We also modified the PRIMER3 program, to be able to design primers from lowercase-masked sequences. CONCLUSION: The GENOMEMASKER package is able to mask the entire human genome for non-unique primers within 6 hours and find locations of all binding sites for 10 000 designed primer pairs within 10 minutes. Additionally, it predicts all alternative PCR products from large genomes for given primer pairs.
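
The idea of lowercase masking can be sketched in a few lines. The example below lowercases every base covered by a k-mer that occurs more than once in the sequence. This is a deliberately simplified stand-in: it looks only at the forward strand, holds all k-mers in memory, and is not the algorithm used by the GENOMEMASKER package; the function name and the toy input are hypothetical.

```python
from collections import Counter


def mask_repeated_kmers(sequence, k):
    """Lowercase every base that lies inside a k-mer occurring more than once."""
    seq = sequence.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    masked = list(seq)
    for start, kmer in enumerate(kmers):
        if counts[kmer] > 1:
            for j in range(start, start + k):
                masked[j] = masked[j].lower()
    return "".join(masked)


# Toy sequence invented for illustration: "ACG" occurs twice.
print(mask_repeated_kmers("ACGTTTACGA", 3))
```

A primer design program that understands lowercase masking can then restrict primers to the uppercase, unique regions.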

    Inparanoid: a comprehensive database of eukaryotic orthologs

    The Inparanoid eukaryotic ortholog database (http://inparanoid.cgb.ki.se/) is a collection of pairwise ortholog groups between 17 whole genomes: Anopheles gambiae, Caenorhabditis briggsae, Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Takifugu rubripes, Gallus gallus, Homo sapiens, Mus musculus, Pan troglodytes, Rattus norvegicus, Oryza sativa, Plasmodium falciparum, Arabidopsis thaliana, Escherichia coli, Saccharomyces cerevisiae and Schizosaccharomyces pombe. Complete proteomes for these genomes were derived from Ensembl and UniProt and compared pairwise using Blast, followed by a clustering step using the Inparanoid program. An Inparanoid cluster is seeded by a reciprocally best-matching ortholog pair, around which inparalogs (should they exist) are gathered independently, while outparalogs are excluded. The ortholog clusters can be searched on the website using Ensembl gene/protein or UniProt identifiers, annotation text or by Blast alignment against our protein datasets. The entire dataset can be downloaded, as can the Inparanoid program itself.
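
The seeding step mentioned in this abstract, finding reciprocally best-matching pairs, can be sketched as follows. The sketch assumes pairwise similarity scores are already available as dictionaries keyed by (query, target); the function name, the data layout and the toy scores are hypothetical, and the later step of gathering inparalogs around each seed is not shown. This is not the Inparanoid program's own code.

```python
def reciprocal_best_hits(scores_ab, scores_ba):
    """Return sorted (a, b) pairs where b is a's best hit and a is b's best hit.

    scores_ab maps (protein_in_A, protein_in_B) -> score for A searched against B;
    scores_ba holds the scores for the reverse search.
    """
    def best_hits(scores):
        best = {}
        for (query, target), score in scores.items():
            if query not in best or score > best[query][1]:
                best[query] = (target, score)
        return {query: target for query, (target, _) in best.items()}

    best_ab = best_hits(scores_ab)
    best_ba = best_hits(scores_ba)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy scores invented for illustration.
scores_ab = {("a1", "b1"): 200, ("a1", "b2"): 150, ("a2", "b2"): 90, ("a3", "b1"): 120}
scores_ba = {("b1", "a1"): 198, ("b1", "a3"): 120, ("b2", "a1"): 150, ("b2", "a2"): 90}
print(reciprocal_best_hits(scores_ab, scores_ba))
```

In the toy data only a1 and b1 choose each other, so they form the single seed pair; a2 and a3 are candidates that a clustering step would have to classify separately.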

    Evaluating the performance of commercial whole-genome marker sets for capturing common genetic variation

    BACKGROUND: New technologies have enabled genome-wide association studies to be conducted with hundreds of thousands of genotyped SNPs. Several different first-generation genome-wide panels of SNPs have been commercialized. The total amount of common genetic variation is still unknown; however, the coverage of commercial panels can be evaluated against reference population samples genotyped by the International HapMap project. Less information is available about coverage in samples from other populations. RESULTS: In this study we compare four commercial panels: the HumanHap 300 and HumanHap 550 Array Sets from the Illumina Infinium series and the Mapping 100 K and Mapping 500 K Array Sets from the Affymetrix GeneChip series. Tagging performance is compared among HapMap CEPH (CEU), Asian (JPT, CHB) and Yoruba (YRI) population samples. It is also evaluated in an Estonian population sample with more than 1000 individuals genotyped in two 500-kbp ENCODE regions of chromosome 2: ENr112 on 2p16.3 and ENr131 on 2q37.1. CONCLUSION: We found that in a non-reference Caucasian population, commercial SNP panels provide levels of coverage similar to those in the HapMap CEPH population sample. We present the proportions of universal and population-specific SNPs in all the commercial platforms studied.

    Detection of tmRNA molecules on microarrays at low temperatures using helper oligonucleotides

    BACKGROUND: The hybridization of synthetic Streptococcus pneumoniae tmRNA on a detection microarray is slow at 34°C, resulting in low signal intensities. RESULTS: We demonstrate that adding specific DNA helper oligonucleotides (chaperones) to the hybridization buffer increases the signal strength at a given temperature and thus makes the specific detection of Streptococcus pneumoniae tmRNA more sensitive. No loss of specificity was observed at low temperatures compared to hybridization at 46°C. The effect of the chaperones can be explained by disruption of the strong secondary and tertiary structure of the target RNA by the selective hybridization of helper molecules. The amplification of the hybridization signal strength by chaperones is not necessarily local; we observed increased signal intensities in both local and distant regions of the target molecule. CONCLUSIONS: The sensitivity of the detection of tmRNA at low temperature can be increased by chaperone oligonucleotides. Due to the complexity of RNA secondary and tertiary structures the effect of any individual chaperone is currently not predictable.

    The mitochondrial genome of the venomous cone snail Conus consors

    Cone snails are venomous predatory marine neogastropods that belong to the species-rich superfamily of the Conoidea. So far, the mitochondrial genomes of two cone snail species (Conus textile and Conus borgesi) have been described, and these feed on snails and worms, respectively. Here, we report the mitochondrial genome sequence of the fish-hunting cone snail Conus consors and describe a novel putative control region (CR) which seems to be absent in the mitochondrial DNA (mtDNA) of other cone snail species. This possible CR spans about 700 base pairs (bp) and is located between the genes encoding the transfer RNA for phenylalanine (tRNA-Phe, trnF) and cytochrome c oxidase subunit III (cox3). The novel putative CR contains several sequence motifs that suggest a role in mitochondrial replication and transcription.